Running head: COUNTING DUPLICATION TREES

The Combinatorics of Tandem Duplication Trees

Authors

  • Olivier Gascuel
  • Michael D. Hendy
  • Alain Jean-Marie
  • Robert McLachlan
Abstract

We develop a recurrence relation that counts the number of tandem duplication trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. We find that the number of rooted duplication trees is exactly twice the number of unrooted trees, which means that, on average, only two positions for a root are possible on a duplication tree. Using the recurrence, we tabulate these numbers for small values of n. We further develop an asymptotic formula that provides estimates of these numbers for large n. These numbers give a priori probabilities for phylogenies of the repeated sequences to be duplication trees. This extends earlier studies in which exhaustive counts were obtained for small n. One application showed the significance of the finding that most maximum parsimony trees constructed from repeat sequences from human immunoglobulins and T-cell receptors were tandem duplication trees. Those findings provided strong support for the proposed mechanisms of tandem gene duplication. The recurrence relation also suggests efficient algorithms to recognize duplication trees and to generate random duplication trees for simulation. A linear-time recognition algorithm is detailed.

Similar resources

The combinatorics of tandem duplication trees.

We developed a recurrence relation that counts the number of tandem duplication trees (either rooted or unrooted) that are consistent with a set of n tandemly repeated sequences generated under the standard unequal recombination (or crossover) model of tandem duplications. The number of rooted duplication trees is exactly twice the number of unrooted trees, which means that on average only two ...

LETTER On Counting Tandem Duplication Trees

Large genomes are full of repeated DNA sequences. It was estimated that over half of the human DNA consists of repeated sequences (Baltimore 2001; Eichler 2001; Leem et al. 2002). Tandem duplication is one of the important evolutionary mechanisms for producing repeated DNA sequences, in which the copies that may or may not contain genes are adjacent along the genome. Fitch (1977) first observed...

Reconstructing the duplication history of tandemly repeated genes.

We present a novel approach to deal with the problem of reconstructing the duplication history of tandemly repeated genes that are supposed to have arisen from unequal recombination. We first describe the mathematical model of evolution by tandem duplication and introduce duplication histories and duplication trees. We then provide a simple recursive algorithm which determines whether or not a ...

The combinatorics of tandem duplication

Tandem duplication is an evolutionary process whereby a segment of DNA is replicated and proximally inserted. The different configurations that can arise from this process give rise to some interesting combinatorial questions. Firstly, we introduce an algebraic formalism to represent this process as a word producing automaton. The number of words arising from n tandem duplications can then be r...

An efficient and accurate distance based algorithm to reconstruct tandem duplication trees

The problem of reconstructing the duplication tree of a set of tandemly repeated sequences that are supposed to have arisen through unequal recombination was first introduced by Fitch (1977, Genetics, 86, 93-104) and has recently received a lot of attention. In this paper, we describe DTSCORE, a fast distance-based algorithm to reconstruct tandem duplication trees, which is statis...

Publication date: 2002